Abstract

We consider the problem of estimating the division rate of a size-structured population in a
nonparametric setting. The size of the system evolves according to a transport-fragmentation
equation: each individual grows with a given transport rate, and splits into two offspring of
the same size, following a binary fragmentation process with unknown division rate that depends
on its size. In contrast to a deterministic inverse problem approach, as in [231 13], we take in
this paper the perspective of statistical inference: our data consist of a large sample of the
sizes of individuals, observed when the evolution of the system is close to its time-asymptotic
behavior, so that it can be related to the eigenproblem of the considered transport-fragmentation
equation (see [22] for instance). By estimating statistically each term of the eigenvalue problem
and by suitably inverting a certain linear operator (see U), we are able to construct a more
realistic estimator of the division rate that achieves the same optimal error bound as in related
deterministic inverse problems. Our procedure relies on kernel methods with automatic bandwidth
selection. It is inspired by model selection and recent results of Goldenshluger and Lepski
[13], [14].
1 Introduction



1.1 Motivation 

Structured models have long served as a representative deterministic model used to de- 
scribe the evolution of biological systems, see for instance [19] or [20] and references therein. 
In their simplest form, structured models describe the temporal evolution of a population 
structured by a biological parameter such as size, age or any significant trait, by means 
of an evolution law, which is a mass balance at the macroscopic scale. A paradigmatic 
example is given by the transport-fragmentation equation in cell division, that reads 

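\[
\begin{cases}
\dfrac{\partial}{\partial t} n(t,x) + \dfrac{\partial}{\partial x}\big(g_0(x)\,n(t,x)\big) + B(x)\,n(t,x) = 4B(2x)\,n(t,2x), & t > 0,\ x > 0,\\[4pt]
g\,n(t,x=0) = 0, & t > 0,\\[4pt]
n(t=0,x) = n^{0}(x), & x > 0.
\end{cases}
\tag{1.1}
\]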

The mechanism captured by Equation (1.1) can be described as a mass balance equation
(see [U [20]): the quantity of cells $n(t,x)$ of size $x$ at time $t$ is fed by a transport
term $g_0(x)$ that accounts for growth by nutrient uptake, and each cell can split into two
offspring of the same size according to a division rate $B(x)$. We suppose that
$g_0(x) = \kappa g(x)$, where the growth rate model $g(x)$ is known up to a multiplicative
constant $\kappa > 0$, and that experimental data are available for $n(t,x)$; the problem we
consider here is to recover the division rate $B(x)$ and the constant $\kappa$.

In [23], Perthame and Zubelli proposed a deterministic method based on the asymptotic
behavior of the cell amount $n(t,x)$: indeed, it is known (see e.g. [21], [22]) that under
suitable assumptions on $g$ and $B$, by the use of the general relative entropy principle (see
[19]), one has

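\[
\int_0^\infty \big| n(t,x)e^{-\lambda t} - \langle n^{0},\phi\rangle N(x) \big|\,\phi(x)\,dx \xrightarrow[t\to\infty]{} 0 \tag{1.2}
\]

where $\langle n^{0},\phi\rangle = \int n^{0}(y)\phi(y)\,dy$ and $\phi$ is the adjoint eigenvector (see [21]). The density
$N$ is the first eigenvector, and $(\lambda, N)$ the unique solution of the following eigenvalue problem

\[
\begin{cases}
\kappa\dfrac{\partial}{\partial x}\big(g(x)N(x)\big) + \lambda N(x) = 4B(2x)N(2x) - B(x)N(x), & x > 0,\\[4pt]
B(0)N(0) = 0, \quad \displaystyle\int N(x)\,dx = 1, \quad N(x) > 0, \quad \lambda > 0.
\end{cases}
\tag{1.3}
\]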

Moreover, under some supplementary conditions, this convergence occurs exponentially 
fast (see [22]). Hence, in the rest of this article, we work under the following analytical 
assumptions. 
Assumption 1 (Analytical assumptions).

1. For the considered nonnegative functions $g$ and $B$ and for $\kappa > 0$, there exists a
unique eigenpair $(\lambda, N)$ solution of Problem (1.3).

2. This solution satisfies, for all $p > 0$, $\int x^{p}N(x)\,dx < \infty$ and $0 < \int g(x)N(x)\,dx < \infty$.

3. The functions $N$ and $gN$ belong to $W^{s+1}$ with $s \ge 1$, and in particular $\|N\|_2 < \infty$
and $\|(gN)'\|_2 < \infty$. ($W^{s+1}$ denotes the Sobolev space of regularity $s+1$ measured in
$L^2$-norm.)

4. We have $g \in L^\infty(\mathbb{R}_+)$ with $\mathbb{R}_+ = [0,\infty)$.

Hereafter $\|\cdot\|_2$ and $\|\cdot\|_\infty$ denote the usual $L^2$ and $L^\infty$ norms on $\mathbb{R}_+$. Assertions 1 and
2 are true under the assumptions on $g$ and $B$ stated in Theorem 1.1 of [3], under which
we also have $N \in L^\infty$. Assertion 3 is a (presumably reasonable) regularity assumption,
necessary to obtain rates of convergence together with the convergence of the numerical
scheme. Assertion 4 is restrictive, but mandatory in order to apply our statistical approach.

Thanks to the asymptotic behavior provided by the entropy principle (1.2), instead
of requiring time-dependent data $n(t,x)$, which are experimentally less precise and more
difficult to obtain, the inverse problem becomes: how to recover $(\kappa, B)$ from observations
on $(\lambda, N)$? In [23] H], as generally done in deterministic inverse problems (see [5]), it was
supposed that experimental data were pre-processed into an approximation $N_\varepsilon$ of $N$ with
an a priori estimate of the form $\|N - N_\varepsilon\| \le \varepsilon$ for a suitable norm $\|\cdot\|$. Then, recovering
$B$ from $N_\varepsilon$ becomes an inverse problem with a certain degree of ill-posedness. From a
modelling point of view, this approach suffers from the limitation that knowledge of $N$ is
postulated in an abstract and somewhat arbitrary sense, which is not genuinely related to
experimental measurements.

1.2 The statistical approach 

In this paper, we propose to overcome the limitation of the deterministic inverse problem
approach by assuming that we have $n$ data, each datum being obtained from the measurement
of an individual cell picked at random, after the system has evolved for a long time
so that the approximation $n(t,x) \approx N(x)e^{\lambda t}$ is valid. This is actually what happens if
one observes cell cultures in the laboratory after a few hours, a typical situation for E. coli
cultures for instance, provided, of course, that the underlying aggregation-fragmentation
equation is valid.

Each datum is viewed as the outcome of a random variable $X_i$, each $X_i$ having probability
distribution $N(x)\,dx$. We thus observe $(X_1, \dots, X_n)$, with

\[
\mathbb{P}(X_1 \in dx_1, \dots, X_n \in dx_n) = \prod_{i=1}^{n} N(x_i)\,dx_i,
\]

and where $\mathbb{P}(\cdot)$ hereafter denotes probability¹. We assume for simplicity that the random
variables $X_i$ are defined on a common probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and that they
are stochastically independent. Our aim is to build an estimator of $B(x)$, that is a function
$x \mapsto \hat B_n(x, X_1, \dots, X_n)$ that approximates the true $B(x)$ with optimal accuracy and
nonasymptotic estimates. To that end, consider the operator

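\[
(\lambda, N) \mapsto L(\lambda, N)(x) := \kappa\frac{\partial}{\partial x}\big(g(x)N(x)\big) + \lambda N(x), \quad x > 0. \tag{1.4}
\]

From representation (1.3), we wish to find $B$, solution to $L(\lambda, N) = \mathcal{L}(BN)$, where

\[
\mathcal{L}(\varphi)(x) := 4\varphi(2x) - \varphi(x), \tag{1.5}
\]

based on statistical knowledge of $(\lambda, N)$ only. Suppose that we have preliminary estimators
$\hat L$ and $\hat N$ of respectively $L(\lambda, N)$ and $N$, and an approximation $\mathcal{L}_k^{-1}$ of $\mathcal{L}^{-1}$. Then we can
reconstruct $B$ in principle by setting formally

\[
\hat B := \frac{\mathcal{L}_k^{-1}(\hat L)}{\hat N}.
\]

This leads us to distinguish three steps that we briefly describe here. The whole method
is fully detailed in Section 2.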

The first and principal step is to find an optimal estimator $\hat L$ for $L(\lambda, N)$. To do so,
the main part consists in applying twice the method of Goldenshluger and Lepski [H]
(GL for short). This method is a new version of the classical Lepski method [9], [10], [11],
[12]. Both methods are adaptive to the regularity of the unknown signal, and the GL
method furthermore provides an oracle inequality. For the unfamiliar reader, we
discuss adaptive properties later on, and explain in detail the GL method and the oracle
point of view in Section 2.

1. First, we estimate the density $N$ by a kernel method, based on a kernel function $K$.
We define $\hat N = \hat N_{\hat h}$, where $\hat N_h$ is defined by (2.1) and the bandwidth $\hat h$ is selected
automatically by (2.3) from a properly chosen set $\mathcal{H}$ (see Section 2.1 for more details).
A so-called oracle inequality is obtained in Proposition 2, measuring the quality of
estimation of $N$ by $\hat N$. Notice that this result, which is just a simplified version of
[14], is valid for estimating any density, since we have only assumed to observe an
$n$-sample of $N$, so that this result can be considered per se.

¹In the sequel, $\mathbb{E}(\cdot)$ likewise denotes the expectation operator with respect to $\mathbb{P}(\cdot)$.

2. Second, we estimate the density derivative (up to $g$) $D = \frac{\partial}{\partial x}(gN)$, again by a kernel
method with the same kernel $K$ as before, and select an optimal bandwidth $\tilde h$ given
by Formula (2.6) similarly. This defines an estimator $\hat D := \hat D_{\tilde h}$, where $\hat D_h$ is specified
by (2.4), and yields an oracle inequality for $\hat D$ stated in Proposition 3. In the same way
as for $\hat N$, this result has an interest per se and is not a direct consequence of



From there, it only remains to find estimators of $\lambda$ and $\kappa$. To that end, we make the
following a priori (but presumably reasonable) Assumption 2 on the existence of an estimator
$\hat\lambda_n$ of $\lambda$.

Assumption 2 (Assumption on $\hat\lambda_n$). There exists some $q > 1$ such that

\[
\varepsilon_{\lambda,n} = \big(\mathbb{E}[|\hat\lambda_n - \lambda|^{q}]\big)^{1/q} < \infty, \qquad R_{\lambda,n} = \mathbb{E}[\hat\lambda_n^{2q}] < \infty.
\]

Indeed, in practical cell culture experiments, one can track $n$ individual cells that have
been picked at random through time. By looking at their evolution, it is possible to infer
$\lambda$ in a classical parametric way, via an estimator $\hat\lambda_n$ that we shall assume to possess from
now on². Based on the following simple equality

\[
\kappa = \lambda\,\rho_g(N), \quad\text{where}\quad \rho_g(N) = \frac{\int_{\mathbb{R}_+} xN(x)\,dx}{\int_{\mathbb{R}_+} g(x)N(x)\,dx}, \tag{1.6}
\]



obtained by multiplying (1.3) by $x$ and integrating by parts, we then define an estimator
$\hat\kappa_n$ by (2.8). Finally, defining $\hat L = \hat\kappa_n\hat D + \hat\lambda_n\hat N$ ends this first step. The second step consists
in the formal inversion of $\mathcal{L}$ and its numerical approximation: for this purpose, we follow
the method proposed in [3] and recalled in Section 2.4. To estimate $H := BN$, we set

\[
\hat H := \mathcal{L}_k^{-1}(\hat L) \tag{1.7}
\]

where $\mathcal{L}_k^{-1}$ is defined by (2.10) on a given interval $[0,T]$. A new approximation result
between $\mathcal{L}^{-1}$ and $\mathcal{L}_k^{-1}$ is given by Proposition 4. The third and final step consists in
setting $\tilde B := \hat H/\hat N$, clipping this estimator in order to avoid explosion when $\hat N$ becomes too
small, finally obtaining

\[
\hat B(x) := \max\big(\min(\tilde B(x), \sqrt{n}), -\sqrt{n}\big). \tag{1.8}
\]
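
As an illustration of the clipping step (1.8), here is a minimal Python sketch, assuming that estimates of $H$ and $N$ are already available as arrays on a common grid. The function name `clipped_division_rate` and the convention of mapping an undefined ratio 0/0 to 0 are choices made for this sketch only; they are not specified in the text.

```python
import numpy as np

def clipped_division_rate(h_hat, n_hat, n_samples):
    """Pointwise max(min(H/N, sqrt(n)), -sqrt(n)), in the spirit of (1.8)."""
    h_hat = np.asarray(h_hat, dtype=float)
    n_hat = np.asarray(n_hat, dtype=float)
    bound = np.sqrt(n_samples)
    # Division by a vanishing density estimate is expected; silence the warnings.
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = h_hat / n_hat
    # Assumption of this sketch: 0/0 is mapped to 0, +/-inf to +/-sqrt(n).
    ratio = np.nan_to_num(ratio, nan=0.0, posinf=bound, neginf=-bound)
    return np.clip(ratio, -bound, bound)
```

The clipping level $\sqrt{n}$ only matters where the density estimate is close to zero; elsewhere the ratio is returned unchanged.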



²Mathematically speaking, this only amounts to enlarging the probability space to a rich enough structure
that captures this estimator. We do not pursue that here.
1.3 Rates of convergence

Because of the approximated inversion of $\mathcal{L}$ on $[0,T]$, we will have access to error bounds
only on $[0,T]$. We set $\|f\|_{2,T}^2 = \int_0^T f^2(x)\,dx$ for the $L^2$-norm restricted to the interval
$[0,T]$. While the fundamental (yet technical) statistical result is the oracle inequality for $\hat H$
stated in Theorem 1 (see Section 2.5), the relevant part with respect to existing works
in the non-stochastic setting [U [23] is its consequence in terms of rates of convergence.
To present them, we need to assume that the kernel $K$ has regularity and vanishing
moments properties.

Assumption 3 (Assumptions on $K$). The kernel $K$ is differentiable with derivative $K'$.
Furthermore, $\int K(x)\,dx = 1$ and $\|K\|_2$ and $\|K'\|_2$ are finite. Finally, there exists a positive
integer $m_0$ such that $\int K(x)x^{p}\,dx = 0$ for $p = 1, \dots, m_0 - 1$ and $I(m_0) := \int |x|^{m_0}K(x)\,dx$
is finite.

Then our proposed estimators satisfy the following properties. 

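Proposition 1. Under Assumptions 1, 2 and 3, let us assume that $R_{\lambda,n}$ and $\sqrt{n}\,\varepsilon_{\lambda,n}$ are
bounded uniformly in $n$ and specify $\mathcal{L}_k^{-1}$ with $k = n$. Assume further that the family of
bandwidths $\mathcal{H} = \tilde{\mathcal{H}} = \{D^{-1} : D = D_{\min}, \dots, D_{\max}\}$ depends on $n$ and is such that
$1 \le D_{\min} \le n^{1/(2m_0+1)}$ and $n^{1/5} \le D_{\max} \le n^{1/2}$ for all $n$. Then $\hat H$ satisfies, for all $s \in [1; m_0 - 1]$,

\[
\mathbb{E}\big[\|\hat H - H\|_{2,T}^{q}\big] = O\big(n^{-qs/(2s+3)}\big). \tag{1.9}
\]

Furthermore, if the kernel $K$ is Lipschitz-regular, if there exists an interval $[a,b]$ in $(0,T)$
such that

\[
[m, M] := \Big[\inf_{x\in[a,b]} N(x),\ \sup_{x\in[a,b]} N(x)\Big] \subset (0,\infty), \qquad Q := \sup_{x\in[a,b]} |H(x)| < \infty,
\]

and if $\ln(n) \le D_{\min} \le n^{1/(2m_0+1)}$ and $n^{1/5} \le D_{\max} \le (n/\ln(n))^{1/(4+\eta)}$ for some $\eta > 0$,
then $\hat B$ satisfies, for all $s \in [1; m_0 - 1]$,

\[
\mathbb{E}\big[\|(\hat B - B)\mathbf{1}_{[a,b]}\|_{2}^{q}\big] = O\big(n^{-qs/(2s+3)}\big). \tag{1.10}
\]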

1.4 Remarks and comparison to other works 

1) Let us establish formal correspondences between the methodology and results when
recovering $B$ from (1.3) from the point of view of statistics or PDE analysis. After
renormalization, we obtain the rate $n^{-s/(2s+3)}$ for estimating $B$, and this corresponds to
ill-posed inverse problems of order 1 in nonparametric statistics. We can make a parallel
with additive deterministic noise following Nussbaum and Pereverzev [18] (see also [H]
and the references therein). Suppose we have an approximate knowledge of $N$ and $\lambda$ up
to deterministic errors $\zeta_1 \in L^2$ and $\zeta_2 \in \mathbb{R}$ with noise level $\varepsilon > 0$: we observe

\[
N_\varepsilon = N + \varepsilon\zeta_1, \qquad \|\zeta_1\|_2 \le 1, \tag{1.11}
\]

and $\lambda_\varepsilon = \lambda + \varepsilon\zeta_2$, $|\zeta_2| \le 1$. From the representation

\[
B = \frac{\mathcal{L}^{-1}L(N,\lambda)}{N},
\]

where $L(N,\lambda)$ is defined in (1.4), we have that the recovery of $L(N,\lambda)$ is ill-posed in the
terminology of Wahba [25], for it involves the computation of the derivative of $N$. Since
$\mathcal{L}$ is bounded with an inverse bounded in $L^2$ and the dependence in $\lambda$ is continuous, the
overall inversion problem is ill-posed of degree $\alpha = 1$. By classical inverse problem theory
for the linear case³, this means that if $N \in W^{s}$, the optimal recovery rate in $L^2$-error norm
should be $\varepsilon^{s/(s+\alpha)} = \varepsilon^{s/(s+1)}$ (see also the work of Doumic, Perthame and collaborators).

Suppose now that we replace the deterministic noise $\zeta_1$ by a random Gaussian white
noise: we observe

\[
N_\varepsilon = N + \varepsilon\mathbb{B} \tag{1.12}
\]

where $\mathbb{B}$ is a Gaussian white noise, i.e. a random distribution in $W^{-1/2}$ that operates on test
functions $\varphi \in L^2$ and such that $\mathbb{B}(\varphi)$ is a centered Gaussian variable with variance $\|\varphi\|_2^2$.
Model (1.12) serves as a representative toy model for most stochastic error models such
as density estimation or signal recovery in the presence of noise. Let us formally introduce
the $\alpha$-fold integration operator $\mathcal{I}^{\alpha}$ and the derivation operator $\partial$. We can rewrite (1.12)
as

\[
N_\varepsilon = \mathcal{I}^{1}(\partial N) + \varepsilon\mathbb{B}
\]

and applying $\mathcal{I}^{1/2}$ to both sides, we (still formally) equivalently observe

\[
Z_\varepsilon := \mathcal{I}^{1/2}N_\varepsilon = \mathcal{I}^{3/2}(\partial N) + \varepsilon\,\mathcal{I}^{1/2}\mathbb{B}.
\]

We are back to a deterministic setting, since in this representation, the noise
$\varepsilon\,\mathcal{I}^{1/2}\mathbb{B}$ is in $L^2$. In order to recover $\partial N$ from $Z_\varepsilon$, we have to invert the operator $\mathcal{I}^{3/2}$,
which has degree of ill-posedness 3/2. We thus obtain the rate

\[
\varepsilon^{s/(s+3/2)} = \varepsilon^{2s/(2s+3)} = n^{-s/(2s+3)}
\]

for the calibration $\varepsilon = n^{-1/2}$ dictated by (1.12) when we compare our statistical model
with the deterministic perturbation (see for instance [17] for establishing formally the
correspondence $\varepsilon = n^{-1/2}$ in a general setting). This is exactly the rate we find in Proposition 1:
the deterministic error model and the statistical error model coincide to that extent⁴.

³Although here the problem is nonlinear, but that will not affect the argument.

⁴The statistician reader will note that the rate $n^{-s/(2s+3)}$ is also the minimax rate of convergence
when estimating the derivative of a density, see [6].
2) The estimators $\hat H$ and $\hat B$ do not need the exact knowledge of $s$ as an input to recover
this optimal rate of convergence. We just need to know an upper bound $m_0 - 1$ to choose
the regularity of the kernel $K$. This capacity to obtain the optimal rate without knowing
the precise regularity is known in statistics as adaptivity in the minimax sense (see [24]
for instance for more details). It is close in spirit to what the discrepancy principle can do
in deterministic inverse problems [5]. However, in the deterministic framework, one needs
to know the level of noise $\varepsilon$, which is not realistic in practice. In our statistical framework,
this level of noise is linked to the sample size $n$ through the correspondence $\varepsilon = n^{-1/2}$.

3) Finally, note that the rate is polynomial and no extra logarithmic terms appear, as
is often the case when adaptive estimation is considered (see [9], [10], [11], [12]).

The next section explains in more detail the GL approach and presents our estimators
to a full extent, including the fundamental oracle inequalities. It also elaborates on the
methodology related to oracle inequalities. The main advantage of oracle inequalities is
that they hold nonasymptotically (in $n$) and that they guarantee an optimal choice of
bandwidth with respect to the selected risk. Section 3 is devoted to numerical simulations
that illustrate the performance of our method. Proofs are delayed until Section 4.

2 Construction and properties of the estimators 
2.1 Estimation of N by the GL method 

We first construct an estimator of $N$. A natural approach is a kernel method, which is all
the more appropriate for comparisons with analytical methods (see [4] for the deterministic
analogue). The kernel function $K$ should satisfy the following assumption, in force in the
sequel.

Assumption 4 (Assumption on the kernel density estimator). $K : \mathbb{R} \to \mathbb{R}$ is a continuous
function such that $\int K(x)\,dx = 1$ and $\int K^2(x)\,dx < \infty$.

For $h > 0$ and $x \in \mathbb{R}$, define

\[
\hat N_h(x) := \frac{1}{n}\sum_{i=1}^{n} K_h(x - X_i), \tag{2.1}
\]

where $K_h(x) = h^{-1}K(h^{-1}x)$. Note in particular that $\mathbb{E}(\hat N_h) = K_h * N$, where $*$ denotes
convolution. We measure the performance of $\hat N_h$ via its integrated squared error, i.e. the
average $L^2$ distance between $N$ and $\hat N_h$. It is easy to see that

\[
\mathbb{E}\big[\|N - \hat N_h\|_2\big] \le \|N - K_h * N\|_2 + \mathbb{E}\big[\|K_h * N - \hat N_h\|_2\big],
\]
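with

\[
\mathbb{E}\big[\|K_h * N - \hat N_h\|_2^2\big]
= \frac{1}{n^2}\,\mathbb{E}\Big[\int \Big(\sum_{i=1}^{n}\big(K_h(x - X_i) - \mathbb{E}(K_h(x - X_i))\big)\Big)^2 dx\Big]
\le \frac{1}{n}\,\mathbb{E}\Big[\int K_h^2(x - X_1)\,dx\Big]
= \frac{1}{nh}\|K\|_2^2.
\]
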

Applying the Cauchy-Schwarz inequality, we obtain

\[
\mathbb{E}\big[\|N - \hat N_h\|_2\big] \le \|N - K_h * N\|_2 + \frac{1}{\sqrt{nh}}\|K\|_2.
\]



The first term is a bias term; it decreases when $h \to 0$. The second term
is a variance term, which increases when $h \to 0$. If one has to choose $h$ in a
family $\mathcal{H}$ of possible bandwidths, the best choice is $\bar h$, where

\[
\bar h := \arg\min_{h\in\mathcal{H}} \Big\{ \|N - K_h * N\|_2 + \frac{1}{\sqrt{nh}}\|K\|_2 \Big\}.
\]

This ideal compromise $\bar h$ is called the "oracle": it depends on $N$ and thus cannot be used
in practice. Hence one wants to find an automatic (data-driven) method for selecting this
bandwidth. The Lepski method [9], [10], [11], [12] is one of the various theoretical adaptive
methods available for selecting a density estimator. In particular it is the only known
method able to select a bandwidth for kernel estimators. However, the method does not
usually provide a nonasymptotic oracle inequality. Recently, Goldenshluger and Lepski
[13] developed powerful probabilistic tools that make it possible to overcome this weakness and
that can provide a fully data-driven bandwidth selection method. We give here a
practical illustration of their work: how should one select the bandwidth for a given kernel
in dimension 1?

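The main idea is to estimate the bias term by looking at several estimators. The
method consists in setting first, for any $x$ and any $h, h' > 0$,

\[
\hat N_{h,h'}(x) := \frac{1}{n}\sum_{i=1}^{n} (K_h * K_{h'})(x - X_i) = (K_h * \hat N_{h'})(x). \tag{2.2}
\]

Next, for any $h \in \mathcal{H}$, define

\[
A(h) := \sup_{h'\in\mathcal{H}} \Big\{ \|\hat N_{h,h'} - \hat N_{h'}\|_2 - \frac{\chi}{\sqrt{nh'}}\|K\|_2 \Big\}_{+},
\]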
where, given $\varepsilon > 0$, we set $\chi := (1 + \varepsilon)(1 + \|K\|_1)$. The quantity $A(h)$ is actually a good
estimator of $\|N - K_h * N\|_2$ up to the term $\|K\|_1$ (see Section 4). The
next step consists then in setting

\[
\hat h := \arg\min_{h\in\mathcal{H}} \Big\{ A(h) + \frac{\chi}{\sqrt{nh}}\|K\|_2 \Big\}, \tag{2.3}
\]

and our final estimator of $N$ is obtained by putting $\hat N := \hat N_{\hat h}$. Let us specify what we are
able to prove at this stage.

Proposition 2. Assume $N \in L^\infty$ and work under Assumption 4. If $\mathcal{H} \subset \{D^{-1}, D =
1, \dots, D_{\max}\}$ with $D_{\max} = \delta n$ for $\delta > 0$, then, for any $q > 1$,

\[
\mathbb{E}\big[\|\hat N - N\|_2^{2q}\big] \le C(q)\,\chi^{2q} \inf_{h\in\mathcal{H}} \Big\{ \|K_h * N - N\|_2^{2q} + \frac{\|K\|_2^{2q}}{(nh)^{q}} \Big\} + C_1 n^{-q},
\]

where $C(q)$ is a constant depending on $q$ and $C_1$ is a constant depending on $q$, $\varepsilon$, $\delta$, $\|K\|_2$,
$\|K\|_1$ and $\|N\|_\infty$.

The previous inequality is called an oracle inequality, for we have $\mathbb{E}[\|\hat N - N\|_2] \le
(\mathbb{E}[\|\hat N - N\|_2^{2q}])^{1/(2q)}$ and $\hat h$ performs as well as the oracle $\bar h$ up to some multiplicative
constant. In that sense, we are able to select the best bandwidth within our family $\mathcal{H}$.
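
To make the selection rule (2.3) concrete, here is a minimal Python sketch under stated assumptions: a Gaussian kernel (for which $K_h * K_{h'}$ is again Gaussian with bandwidth $\sqrt{h^2 + h'^2}$, $\|K\|_1 = 1$ and $\|K\|_2^2 = 1/(2\sqrt{\pi})$), and $L^2$ norms approximated by Riemann sums on a user-supplied regular grid. The function names, the grid and the default value of `eps` are illustrative choices, not part of the procedure as stated above.

```python
import numpy as np

def gaussian(u, h):
    """Gaussian kernel K_h(u) = K(u / h) / h."""
    return np.exp(-0.5 * (u / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

def kde(grid, sample, h):
    """Kernel density estimator (2.1) evaluated on `grid`."""
    return gaussian(grid[:, None] - sample[None, :], h).mean(axis=1)

def gl_bandwidth(sample, bandwidths, grid, eps=0.1):
    """Select a bandwidth by rule (2.3); returns (h_hat, estimate on grid)."""
    n = sample.size
    dx = grid[1] - grid[0]
    norm_k2 = (4.0 * np.pi) ** -0.25          # ||K||_2 for the Gaussian kernel
    chi = (1.0 + eps) * (1.0 + 1.0)           # (1 + eps)(1 + ||K||_1), ||K||_1 = 1
    l2 = lambda f: np.sqrt(np.sum(f ** 2) * dx)
    penalty = {h: chi * norm_k2 / np.sqrt(n * h) for h in bandwidths}
    est = {h: kde(grid, sample, h) for h in bandwidths}
    scores = {}
    for h in bandwidths:
        a_h = 0.0                              # positive part: A(h) >= 0
        for hp in bandwidths:
            # K_h * K_h' is Gaussian with bandwidth sqrt(h^2 + h'^2), cf. (2.2).
            est_h_hp = kde(grid, sample, np.hypot(h, hp))
            a_h = max(a_h, l2(est_h_hp - est[hp]) - penalty[hp])
        scores[h] = a_h + penalty[h]
    h_hat = min(scores, key=scores.get)
    return h_hat, est[h_hat]
```

For another kernel, the closed form used for $K_h * K_{h'}$ no longer applies and the convolution in (2.2) would have to be computed numerically.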

Remark 1. As compared to the results of Goldenshluger and Lepski in [13], we do not
consider the case where $\mathcal{H}$ is an interval and we do not specify $K$ except for Assumption 4.
This simpler method is more reasonable from a numerical point of view, since estimating $N$
is only a preliminary step. The probabilistic tool we use here is classical in model selection
theory (see Section 4) and actually, we do not use [13] directly. In particular
the main difference is that, in our specific case, we are able to keep $\max(\mathcal{H})$ fixed whereas
Goldenshluger and Lepski [13] require $\max(\mathcal{H})$ to tend to 0 with $n$. The price to pay is
that we obtain a uniform bound (see Section 4) which is less tight, but that
will be sufficient for our purpose.

2.2 Estimation of $\frac{\partial}{\partial x}(g(x)N(x))$ by the GL method

The previous method can of course be adapted to estimate

\[
D(x) := \frac{\partial}{\partial x}\big(g(x)N(x)\big).
\]

We adjust here the work of [Tl] to the setting of estimating a derivative. We again use
kernel estimators with more stringent assumptions⁵ on $K$.

⁵For the sake of simplicity we use the same kernel to estimate $N$ and $D$, but this choice is not mandatory.

Assumption 5 (Assumption on the kernel of the derivative estimator). The function
$K$ is differentiable, $\int K(x)\,dx = 1$ and $\int (K'(x))^2\,dx < \infty$.



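For any bandwidth $h > 0$, we define the kernel estimator of $D$ as

\[
\hat D_h(x) := \frac{1}{n}\sum_{i=1}^{n} g(X_i)K_h'(x - X_i) = \frac{1}{nh^2}\sum_{i=1}^{n} g(X_i)K'\Big(\frac{x - X_i}{h}\Big). \tag{2.4}
\]

Indeed

\[
\mathbb{E}(\hat D_h(x)) = \int K_h'(x - u)g(u)N(u)\,du = \big(K_h' * (gN)\big)(x) = \big(K_h * (gN)'\big)(x).
\]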
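
The estimator (2.4) can be sketched in a few lines of Python. This is only an illustration, assuming a Gaussian kernel (so that $K'(v) = -vK(v)$); the function names and the `growth` argument, a callable standing for $g$ and defaulting to $g \equiv 1$, are hypothetical choices of this sketch.

```python
import numpy as np

def gaussian_kernel_derivative(u, h):
    """K_h'(u) = K'(u / h) / h^2 for the Gaussian kernel, where K'(v) = -v K(v)."""
    v = u / h
    return -v * np.exp(-0.5 * v ** 2) / (h ** 2 * np.sqrt(2.0 * np.pi))

def derivative_estimator(grid, sample, h, growth=lambda x: np.ones_like(x)):
    """Kernel estimator (2.4) of D = (gN)' evaluated on `grid`."""
    weights = growth(sample)                                   # g(X_i)
    kernel_values = gaussian_kernel_derivative(grid[:, None] - sample[None, :], h)
    return (weights[None, :] * kernel_values).mean(axis=1)     # average over i
```

With a single observation at 0, $g \equiv 1$ and $h = 1$, the output is simply $K'(x) = -xK(x)$, which gives a quick sanity check of the sign convention.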

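Again we can look at the integrated squared error of $\hat D_h$. We obtain the following upper
bound:

\[
\mathbb{E}\big[\|\hat D_h - D\|_2\big] \le \|D - K_h * D\|_2 + \mathbb{E}\big[\|K_h * D - \hat D_h\|_2\big],
\]

with

\[
\begin{aligned}
\mathbb{E}\big[\|K_h * D - \hat D_h\|_2^2\big]
&= \frac{1}{n^2}\,\mathbb{E}\Big[\int \Big(\sum_{i=1}^{n}\big(g(X_i)K_h'(x - X_i) - \mathbb{E}(g(X_i)K_h'(x - X_i))\big)\Big)^2 dx\Big] \\
&= \frac{1}{n^2}\int \sum_{i=1}^{n}\mathbb{E}\Big[\big(g(X_i)K_h'(x - X_i) - \mathbb{E}(g(X_i)K_h'(x - X_i))\big)^2\Big]\,dx \\
&\le \frac{1}{n}\,\mathbb{E}\Big[\int g^2(X_1)K_h'^2(x - X_1)\,dx\Big] \\
&\le \frac{1}{n}\|g\|_\infty^2\|K_h'\|_2^2 = \frac{1}{nh^3}\|g\|_\infty^2\|K'\|_2^2.
\end{aligned}
\]

Hence, by the Cauchy-Schwarz inequality,

\[
\mathbb{E}\big[\|D - \hat D_h\|_2\big] \le \|D - K_h * D\|_2 + \frac{1}{\sqrt{nh^3}}\|g\|_\infty\|K'\|_2.
\]

Once again, there is a bias-variance decomposition, but now the variance term is of order
$\frac{1}{\sqrt{nh^3}}\|g\|_\infty\|K'\|_2$. We therefore define the oracle by

\[
\bar{\tilde h} := \arg\min_{h\in\tilde{\mathcal{H}}} \Big\{ \|D - K_h * D\|_2 + \frac{1}{\sqrt{nh^3}}\|g\|_\infty\|K'\|_2 \Big\}. \tag{2.5}
\]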

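Now let us apply the GL method in this case. Let $\tilde{\mathcal{H}}$ be a family of bandwidths. We set
for any $h, h' > 0$,

\[
\hat D_{h,h'}(x) := \frac{1}{n}\sum_{i=1}^{n} g(X_i)(K_h * K_{h'})'(x - X_i)
\]

and

\[
\tilde A(h) := \sup_{h'\in\tilde{\mathcal{H}}} \Big\{ \|\hat D_{h,h'} - \hat D_{h'}\|_2 - \frac{\tilde\chi}{\sqrt{nh'^3}}\|g\|_\infty\|K'\|_2 \Big\}_{+},
\]

where, given $\tilde\varepsilon > 0$, we put $\tilde\chi := (1 + \tilde\varepsilon)(1 + \|K\|_1)$. Finally, we estimate $D$ by using
$\hat D := \hat D_{\tilde h}$ with

\[
\tilde h := \arg\min_{h\in\tilde{\mathcal{H}}} \Big\{ \tilde A(h) + \frac{\tilde\chi}{\sqrt{nh^3}}\|g\|_\infty\|K'\|_2 \Big\}. \tag{2.6}
\]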

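As before, we are able to prove an oracle inequality for $\hat D$.

Proposition 3. Assume $N \in L^\infty$. Work under Assumption 5. If $\tilde{\mathcal{H}} \subset \{D^{-1}, D =
1, \dots, D_{\max}\}$ with $D_{\max} = \sqrt{\tilde\delta n}$ for $\tilde\delta > 0$, then for any $q > 1$,

\[
\mathbb{E}\big[\|\hat D - D\|_2^{2q}\big] \le C(q)\,\tilde\chi^{2q} \inf_{h\in\tilde{\mathcal{H}}} \Big\{ \|K_h * D - D\|_2^{2q} + \Big(\frac{\|g\|_\infty\|K'\|_2}{\sqrt{nh^3}}\Big)^{2q} \Big\} + C_1 n^{-q},
\]

where $C(q)$ is a constant depending on $q$ and $C_1$ is a constant depending on $q$, $\tilde\varepsilon$, $\tilde\delta$, $\|K'\|_2$,
$\|K'\|_1$, $\|g\|_\infty$ and $\|N\|_\infty$.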

2.3 Estimation of $\kappa$ (and $\lambda$)

As mentioned in the introduction, we will not consider the problem of estimating $\lambda$ and we
work under Assumption 2: an estimator $\hat\lambda_n$ of $\lambda$ is furnished by the practitioner prior to
the data processing for estimating $B$. It becomes subsequently straightforward to obtain
an estimator of $\kappa$ by estimating $\rho_g(N)$, see the form of (1.6). We estimate $\rho_g(N)$ by

\[
\hat\rho_n := \frac{\sum_{i=1}^{n} X_i}{\sum_{i=1}^{n} g(X_i) + c} \tag{2.7}
\]

where $c > 0$ is a (small) tuning constant⁶. Next we simply put

\[
\hat\kappa_n = \hat\lambda_n\hat\rho_n. \tag{2.8}
\]
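
As a minimal illustration of (2.7) and (2.8), the following Python sketch computes the plug-in estimate of $\kappa$ from a sample, an externally supplied estimate of $\lambda$ and a growth function. The function name `estimate_kappa` and its signature are hypothetical; the default `c=0.0` is an assumption of the sketch (a small positive value can be passed instead).

```python
import numpy as np

def estimate_kappa(sample, lambda_hat, growth, c=0.0):
    """Plug-in estimate kappa_hat = lambda_hat * rho_hat, following (2.7)-(2.8)."""
    sample = np.asarray(sample, dtype=float)
    # rho_hat: empirical counterpart of int x N(x) dx / int g(x) N(x) dx.
    rho_hat = sample.sum() / (growth(sample).sum() + c)
    return lambda_hat * rho_hat
```

For instance, with $g(x) = x$ and $c = 0$ the ratio $\hat\rho_n$ equals 1, so the estimate of $\kappa$ coincides with the supplied estimate of $\lambda$.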

2.4 Approximated inversion of $\mathcal{L}$

From (2.7), the left-hand side of (1.3) is consequently estimated by

\[
\hat L = \hat\kappa_n\hat D + \hat\lambda_n\hat N.
\]

It remains to formally apply the inverse operator $\mathcal{L}^{-1}$. However, given $\varphi$, the dilation
equation

\[
\mathcal{L}(\psi)(x) = 4\psi(2x) - \psi(x) = \varphi(x), \quad x \in \mathbb{R}_+ \tag{2.9}
\]

admits in general infinitely many solutions, see Doumic et al. [1], Appendix A. Nevertheless,
if $\varphi \in L^2$, there is a unique solution $\psi \in L^2$ to (2.9), see Proposition A.1 in [3], and
moreover it defines a continuous operator $\mathcal{L}^{-1}$ from $L^2$ to $L^2$. Since $K_h$ and $gK_h'$ belong
to $L^2$, one can define a unique solution to (2.9) when $\varphi = \hat\kappa_n\hat D + \hat\lambda_n\hat N$. This inverse is
not analytically known, but we can approximate it via the fast algorithm described
below.

⁶In practice, one can take $c = 0$.

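Given $T > 0$ and an integer $k > 1$, we construct a linear operator $\mathcal{L}_k^{-1}$ that maps a
function $\varphi \in L^2$ into a function with compact support in $[0,T]$ as follows. Consider the
regular grid on $[0,T]$ with mesh $k^{-1}T$ defined by

\[
0 = x_{0,k} < x_{1,k} < \dots < x_{i,k} := \frac{i}{k}T < \dots < x_{k,k} = T.
\]

We set

\[
\varphi_{i,k} := \frac{k}{T}\int_{x_{i,k}}^{x_{i+1,k}} \varphi(x)\,dx \quad\text{for } i = 0, \dots, k-1,
\]

and define by induction the sequence⁷

\[
H_{i,k}(\varphi) := \frac{1}{4}\big(H_{i/2,k}(\varphi) + \varphi_{i/2,k}\big), \quad i = 0, \dots, k-1,
\]

which gives, for $i = 0$ and $i = 1$,

\[
H_{0,k}(\varphi) := \frac{1}{3}\varphi_{0,k}, \qquad H_{1,k}(\varphi) := \frac{4}{21}\varphi_{0,k} + \frac{1}{7}\varphi_{1,k}.
\]

Finally, we define

\[
\mathcal{L}_k^{-1}(\varphi)(x) := \sum_{i=0}^{k-1} H_{i,k}(\varphi)\,\mathbf{1}_{[x_{i,k},\,x_{i+1,k})}(x). \tag{2.10}
\]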
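
The induction above can be sketched in Python as follows. This is an illustrative reading of the scheme, assuming the half-index convention of the accompanying footnote (average of the two neighbouring indices when $i$ is odd) and a midpoint rule for the cell averages; the function names and the parameter `points_per_cell` are hypothetical.

```python
import numpy as np

def cell_averages(phi, T, k, points_per_cell=50):
    """Approximate phi_{i,k}, the mean of phi over [x_i, x_{i+1}), by a midpoint rule."""
    edges = np.linspace(0.0, T, k + 1)
    out = np.empty(k)
    for i in range(k):
        width = (edges[i + 1] - edges[i]) / points_per_cell
        mids = edges[i] + width * (np.arange(points_per_cell) + 0.5)
        out[i] = np.mean(phi(mids))
    return out

def half_index(u, i):
    """u_{i/2}: u[i/2] if i is even, mean of the two neighbours otherwise."""
    if i % 2 == 0:
        return u[i // 2]
    return 0.5 * (u[(i - 1) // 2] + u[(i + 1) // 2])

def approximate_inverse(phi_avg):
    """Coefficients H_{i,k} of the piecewise-constant approximation (2.10)."""
    k = len(phi_avg)
    H = np.zeros(k)
    H[0] = phi_avg[0] / 3.0                      # solves H_0 = (H_0 + phi_0) / 4
    if k > 1:
        H[1] = 4.0 * phi_avg[0] / 21.0 + phi_avg[1] / 7.0
    for i in range(2, k):                        # explicit: half indices are < i
        H[i] = 0.25 * (half_index(H, i) + half_index(phi_avg, i))
    return H

def evaluate_piecewise(H, T, x):
    """Evaluate sum_i H_i 1_{[x_i, x_{i+1})}(x); zero outside [0, T)."""
    H = np.asarray(H, dtype=float)
    k = len(H)
    x = np.asarray(x, dtype=float)
    idx = np.clip(np.floor(x * k / T).astype(int), 0, k - 1)
    inside = (x >= 0.0) & (x < T)
    return np.where(inside, H[idx], 0.0)
```

A constant input $\varphi \equiv c$ is a convenient check: the exact solution of $4\psi(2x) - \psi(x) = c$ is $\psi \equiv c/3$, and the recursion reproduces it on every cell.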

As stated in the introduction, we eventually estimate $H = BN$ by

\[
\hat H = \mathcal{L}_k^{-1}(\hat\kappa_n\hat D + \hat\lambda_n\hat N).
\]

The stability of the inversion is given by the fact that $\mathcal{L}_k^{-1} : L^2 \to L^2$ is continuous, see
Section 4.3, and by the following approximation result between $\mathcal{L}^{-1}$ and $\mathcal{L}_k^{-1}$.



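⁷For any sequence $u_i$, $i = 1, 2, \dots$, we define

\[
u_{i/2} := \begin{cases} u_{i/2} & \text{if } i \text{ is even}, \\ \tfrac{1}{2}\big(u_{(i-1)/2} + u_{(i+1)/2}\big) & \text{otherwise}. \end{cases}
\]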
Proposition 4. Let $T > 0$ and $\varphi \in W^{1}$. Let $\mathcal{L}^{-1}(\varphi)$ denote the unique solution of (2.9)
belonging to $L^2$. We have for $k > 1$:

with C < 

Hence, $\mathcal{L}_k^{-1}$ behaves nicely over sufficiently smooth functions. Moreover, the estimation
of $N$ and the estimators $\hat\kappa_n$ and $\hat\lambda_n$ are essentially regular. Finally we estimate $B$ as
stated in (1.8). The overall behaviour of the estimator is finally governed by the quality
of estimation of the derivative $D$, which determines the accuracy of the whole inverse
problem in all these successive steps.

2.5 Oracle inequalities for H 

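We are ready to state our main result, namely the oracle inequality fulfilled by $\hat H$.

Theorem 1. Work under Assumptions 1 and 2 and let $K$ be a kernel satisfying Assumptions
4 and 5. Define $\mathcal{H} \subset \{D^{-1} : D = 1, \dots, D_{\max}\}$ with $D_{\max} = \delta n$ for $\delta > 0$ and
$\tilde{\mathcal{H}} \subset \{D^{-1} : D = 1, \dots, \tilde D_{\max}\}$ with $\tilde D_{\max} = \sqrt{\tilde\delta n}$ for $\tilde\delta > 0$. For $k > 1$ and $T > 0$,
let us define $\mathcal{L}_k^{-1}$ by (2.10) on the interval $[0,T]$. Finally, define the estimator

\[
\hat H = \mathcal{L}_k^{-1}\big(\hat\kappa_n\hat D_{\tilde h} + \hat\lambda_n\hat N_{\hat h}\big),
\]

where $\hat N_h$ and $\hat D_h$ are kernel estimators defined respectively by (2.1) and (2.4), and where
we have selected $\hat h$ and $\tilde h$ by (2.3) and (2.6). Moreover take $\hat\kappa_n$ as defined by (2.7) and
(2.8) for some $c > 0$.

The following upper bound holds for any $n$:

\[
\begin{aligned}
\mathbb{E}\big[\|\hat H - H\|_{2,T}^{q}\big] \le C_1\Big\{ & \sqrt{R_{\lambda,n}}\,\inf_{h\in\tilde{\mathcal{H}}}\Big[\|K_h * D - D\|_2^{q} + \Big(\frac{\|g\|_\infty\|K'\|_2}{\sqrt{nh^3}}\Big)^{q}\Big] \\
& + \inf_{h\in\mathcal{H}}\Big[\|K_h * N - N\|_2^{q} + \Big(\frac{\|K\|_2}{\sqrt{nh}}\Big)^{q}\Big] + \varepsilon_{\lambda,n}^{q} + \Big(\big(\|N\|_{W^1} + \|gN\|_{W^2}\big)\frac{T}{k}\Big)^{q}\Big\} + C_2 n^{-q/2},
\end{aligned}
\]

where $C_1$ is a constant depending on $q$, $g$, $N$, $\varepsilon$, $\tilde\varepsilon$ and $c$; and $C_2$ is a constant
depending on $q$, $g$, $N$, $\varepsilon$, $\tilde\varepsilon$, $\delta$, $\tilde\delta$, $\|K\|_2$ and $\|K'\|_2$.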

1) Note that the upper bound quantifies the additive cost of each step used in the
estimation method. The first part is an oracle bound on the estimation of $D$ times the
size of the estimator of $\lambda$. The second part is the oracle bound for $N$. Of course, the
results are sharp only if $\hat\lambda_n$ is good, which can be seen through $\varepsilon_{\lambda,n}$. Finally, the bound is
also governed by the approximated inversion through the only term where $k$ appears. The
last term is just a residual term, which will be negligible in most cases with respect
to the other terms. In particular, since all the previous errors are somehow unavoidable,
this means that, as far as our method is concerned, our upper bound is the best possible
that can be achieved in order to select the different bandwidths $h$, up to multiplicative
constants. Moreover, one can see how the limitation in $k$ influences the method and how
large $k$ needs to be chosen to guarantee that the main error comes from the fact that one
estimates a derivative.

2) The result holds for any n. In particular, we expect the method to perform well for 
small sample size, see the numerical illustration below. This also shows that we are able 
to select a good bandwidth as far as the kernel K is fixed, even if there is no assumption 
on the moments of K and consequently on the approximation properties of K. In the next 
simulation section, we focus on a Gaussian kernel which has only one vanishing moment 
(hence one cannot really consider minimax adaptation for regular function with it) but for 
which the bandwidth choice is still important in practice. The previous result guarantees 
an optimal bandwidth choice even for this kernel, up to some multiplicative constant. 

3) From this oracle result, we can easily deduce the rates of convergence of Proposition
1 at the price of further assumptions on the kernel, i.e. Assumption 3 defined in the
Introduction. Section 1.2.2 of [Mj recalls how to build compactly supported kernels satisfying
Assumption 3. If $m_0$ satisfies Assumption 3, then for any $s < m_0$, for any $f \in W^{s}$,

\[
\|K_h * f - f\|_2 \le C\|f\|_{W^s}h^{s},
\]

where $C$ is a constant that can be expressed by using $K$ and $m_0$ (see Theorem 8.1 of [8]).
Now, it is sufficient to choose $h$ of order $n^{-1/(2s+3)}$ to obtain (1.9). The complete
proof of Proposition 1 is delayed until the last section.
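
The choice of bandwidth in remark 3) can be checked by a short heuristic computation, sketched here with all constants omitted: it balances the bias bound above, applied to $D$, against the variance term of Section 2.2.

```latex
% Bias (kernel of order m_0, D in W^s):   \|K_h * D - D\|_2 \lesssim h^{s}
% Variance term (Section 2.2):            \|g\|_\infty \|K'\|_2 / \sqrt{n h^{3}}
\[
h^{s} \asymp \frac{1}{\sqrt{n h^{3}}}
\;\Longleftrightarrow\;
h^{2s+3} \asymp n^{-1}
\;\Longleftrightarrow\;
h \asymp n^{-1/(2s+3)},
\qquad\text{so that}\qquad
h^{s} \asymp n^{-s/(2s+3)}.
\]
% Example: s = 1 gives h \asymp n^{-1/5} and an error of order n^{-1/5}.
```

This is only the balancing argument behind the rate; the oracle inequality is what guarantees that the data-driven bandwidth achieves it without knowing $s$.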

3 Numerical illustration 

Let us illustrate our method through some simulations. 
3.1 The numerical protocol 

First, we need to build up simulated data: to do so, we start from given $\kappa$ and $B$
on a regular grid $[0, dx, \dots, X_m]$ and solve the direct problem by the use of the so-called
power algorithm to find the corresponding density $N$ and the principal eigenvalue $\lambda$ (see for
instance [3] for more details). We check that $N$ almost vanishes outside the interval $[0, X_m]$;
otherwise, $X_m$ has to be increased. The density $N$ being given on this grid, we approximate it
by a spline function, and build an $n$-sample by the rejection sampling algorithm. For the
sake of simplicity, we do not simulate an approximation of $\lambda$ and keep the exact value,
thus leading, in the estimate of Theorem 1, to $R_{\lambda,n} = \lambda^{2q}$ and $\hat\lambda_n = \lambda$.
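
The sampling step can be sketched as follows in Python. This is a simplified illustration rather than the protocol itself: it uses linear interpolation of the gridded density (via `np.interp`) instead of a spline, a uniform proposal on the grid range, and a hypothetical function name. It assumes the density values are nonnegative and not all zero.

```python
import numpy as np

def rejection_sample(grid, density, size, rng):
    """Draw `size` points from a (possibly unnormalized) density given on `grid`."""
    upper = density.max()      # envelope: a linear interpolant never exceeds its nodes
    out = np.empty(0)
    while out.size < size:
        m = 2 * (size - out.size) + 16
        x = rng.uniform(grid[0], grid[-1], m)    # uniform proposals on the grid range
        u = rng.uniform(0.0, upper, m)
        accepted = x[u <= np.interp(x, grid, density)]
        out = np.concatenate([out, accepted])
    return out[:size]
```

Since only the ratio of the density to its maximum matters, the gridded values do not need to be normalized beforehand.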

We then follow step by step the method proposed here and detailed in Section 2.

1. The GL method for the choice of $\hat N = \hat N_{\hat h}$. We take the classical Gaussian kernel
$K(x) = (2\pi)^{-1/2}\exp(-x^2/2)$, set $D_{\max} = n$ and limit ourselves to a logarithmic
sampling $\mathcal{H} = \{1, 1/2, \dots, 1/9, 1/10, 1/20, \dots, 1/100, 1/200, \dots, 1/n\}$ in order to
reduce the cost of computations (the GL method is indeed the most time-consuming
step in the numerical protocol).

2. The GL method for the choice of $\hat D_{\tilde h}$. The procedure is similar except that we choose
here $D_{\max} = \sqrt{n}$. The selected bandwidths $\hat h$ and $\tilde h$ can be different. We check that
the GL method does not select an extremal point of $\mathcal{H}$.

3. The choice of $\hat\kappa_n$, as defined by (2.8).

4. The numerical scheme described in Section 2.4 and in [4] for the inversion of $\mathcal{L}$.

5. The division by $\hat N$ and definition of $\hat B$ as described in (1.8).
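
The logarithmic sampling of bandwidths used in steps 1 and 2 can be generated as in the following Python sketch. The function name and the handling of a maximal denominator that is not of the form $m \cdot 10^j$ (the grid then stops at the largest such value below it) are assumptions of this sketch.

```python
def logarithmic_bandwidths(d_max):
    """Bandwidths 1/D for D in {1,...,9, 10, 20,...,90, 100, 200, ...}, D <= d_max."""
    denominators = []
    step = 1
    while step <= d_max:
        for m in range(1, 10):          # m * 10^j for m = 1, ..., 9
            d = m * step
            if d <= d_max:
                denominators.append(d)
        step *= 10
    return [1.0 / d for d in denominators]
```

The number of candidate bandwidths then grows like $\log_{10}(D_{\max})$ instead of $D_{\max}$, which keeps the double loop over $(h, h')$ in the GL criterion affordable.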

At each step, we compare, in $L^2$-norm, the reconstructed function and the original one:
$\hat N$ vs $N$, $\hat D$ vs $\frac{\partial}{\partial x}(gN)$, $\hat H$ vs $BN$ and finally $\hat B$ vs $B$.

3.2 Results on simulated data 

We first test the three cases simulated in [3], in which the numerical analysis approach
was dealt with. Namely, on the interval $[0,4]$, we consider the cases where $g \equiv 1$ and first
$B = B_1 \equiv 1$, second $B(x) = B_2(x) = 1$ for $x < 1.5$, then linear up to $B_2(x) = 5$ for $x > 1.7$.
This particular form is interesting because, due to this fast increase of $B$, the solution
$N$ is not that regular and exhibits a two-peak distribution (see Figure 1). Finally, we test
$B(x) = B_3(x) = \exp(-8(x - 2)^2) + 1$.

In Figures 1 and 2, we show the simulation results with n = 5.10^ (a realistic value
for in vitro experiments on E. coli for instance) for the reconstruction of $N$, $\frac{\partial}{\partial x}(gN)$, $BN$
and $B$.

One notes that the estimator captures the global behavior of the division rate $B$ well 
but, as expected, has more difficulty recovering fine details (for instance, the difference 
between $B_1$ and $B_3$) and also gives a much larger error when $B$ is less regular (case of $B_2$). 
One also notes that even if the reconstruction of $N$ is very satisfactory, the critical point 
is the reconstruction of its derivative. Moreover, for large values of $x$, even if $N$ and its 
derivative are correctly reconstructed, the method fails in finding a proper division rate $B$. 
This is due to two facts: first, $N$ vanishes, so the division by $\hat N$ leads to error amplification. 
Figure 1: Reconstruction of $N$ (left) and of $\frac{\partial}{\partial x}(gN)$ (right) obtained with a sample of 
n = 5.10^ data, for three different cases of division rates $B$. 

Second, the values taken by $B(x)$ for large $x$ have little influence on the solutions $N$ of the 
direct problem: whatever the values of $B$, the solutions $N$ will not vary much, as shown 
by Figure 1 (left). A similar phenomenon occurred indeed when solving the deterministic 
problem in [3] (for instance, we refer to Fig. 10 of this article for a comparison of the 
results). 

We also test a case closer to true biological data, namely the case $B(x) = x^2$ and 
$g(x) = x$. The results are shown in Figures 3 and 4 for $n$-samples of size $10^3$, $5 \cdot 10^3$, $10^4$ 
and $5 \cdot 10^4$. 

One notes that the reconstruction is already very good for $N$ when $n = 10^3$, unlike the 
reconstruction of $\frac{\partial}{\partial x}(gN)$, which requires much more data. 

Finally, in Table 3.2 (first table) we give average error results on 50 simulations, for n = 1000, 
$g \equiv B \equiv 1$. We display the relative errors in $L^2$ norms (defined by $\|\hat\varphi - \varphi\|_{L^2}/\|\varphi\|_{L^2}$), 
and their empirical variances. In Table 3.2 (second table), for the case $g(x) = x$ and $B(x) = x^2$, we 
give some results on standard errors for various values of $n$, and compare them to 
$n^{-1/5}$, which is the order of magnitude of the expected final error on $BN$, since with a 
Gaussian kernel we have $s = 1$ in Proposition 1. We see that our numerical results are in 
line with the theoretical estimates: indeed, the error on $\hat H$ is roughly twice as large as $n^{-1/5}$. 
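
The comparison with the theoretical rate can be reproduced with a few lines of Python. The sketch below is an illustration only: `relative_l2_error` is a hypothetical helper implementing the relative $L^2$ error on a uniform grid, and the error values are those reported in the table for $g(x) = x$, under the assumption that its last column is the error on $\hat H$.

```python
import numpy as np

def relative_l2_error(estimate, truth, dx):
    """||estimate - truth||_2 / ||truth||_2 for functions sampled on a uniform grid."""
    num = np.sqrt(np.sum((estimate - truth) ** 2) * dx)
    den = np.sqrt(np.sum(truth ** 2) * dx)
    return num / den

# Sample sizes and errors on H copied from the table of this section
# (the identification of the last column with H is an assumption).
n = np.array([1e3, 5e3, 1e4, 5e4])
error_H = np.array([0.42, 0.28, 0.29, 0.19])

rate = n ** (-1.0 / 5.0)   # expected order of magnitude for s = 1
ratio = error_H / rate     # observed error divided by the expected rate
```

For these four sample sizes the ratio stays between 1.5 and 2, in line with the statement that the error is roughly twice as large as $n^{-1/5}$.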
Figure 2: Reconstruction of $BN$ (left) and of $B$ (right) obtained with a sample of n = 5.10^ 
data, for three different cases of division rates $B$. 



Error on N: average    Variance    Error on $\frac{\partial}{\partial x}(gN)$: average    Variance 
0.088                  0.15        0.51                                  0.28 

Error on BN: average   Variance    Average $\hat h$    Average $\tilde h$ 
0.39                   0.29        0.12                OO 



n            $n^{-1/5}$   $\hat h$   $\tilde h$   error on N   error on D   error on H 
$10^3$         0.25       0.1      0.5      0.06         0.68         0.42 
$5 \cdot 10^3$   0.18       0.07     0.3      0.03         0.45         0.28 
$10^4$         0.16       0.08     0.3      0.035        0.46         0.29 
$5 \cdot 10^4$   0.11       0.04     0.2      0.014        0.31         0.19 



4 Proofs 

In Section 4.1, we first give the proofs of the main results of Section 2. This allows us, in 
Section 4.2, to prove Theorem 1 and Proposition 1, which require the collection of all the results 
of Section 2, i.e. the oracle-type inequalities on the one hand and a numerical analysis 
result on the other hand. This illustrates the subject of our paper, which lies at the frontier 
between these fields. Finally, we state and prove the technical lemmas used in Section 4.1. 
These technical tools are concerned with probabilistic results, namely concentration and 
Rosenthal-type inequalities that are often the main bricks to establish oracle inequalities, 
and also the boundedness of $\mathcal{L}_k^{-1}$. In the sequel, the notation $\square_{\theta_1,\theta_2,\ldots}$ denotes a generic 
positive constant depending on $\theta_1, \theta_2, \ldots$ (the notation $\square$ simply denotes a generic positive 
absolute constant). It means that the values of $\square_{\theta_1,\theta_2,\ldots}$ may change from line to line. 

Figure 3: Reconstruction of $N$ (left) and of $\frac{\partial}{\partial x}(gN)$ (right) obtained for $g(x) = x$ and 
$B(x) = x^2$, for various sample sizes. 

4.1 Proofs of the main results of Section [2] 
Proof of Proposition [2] 

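For any $h^* \in \mathcal{H}$, we have: 

$$\|\hat N - N\|_2 \le \|\hat N_{\hat h} - \hat N_{\hat h, h^*}\|_2 + \|\hat N_{\hat h, h^*} - \hat N_{h^*}\|_2 + \|\hat N_{h^*} - N\|_2 \le A_1 + A_2 + A_3,$$

with 

$$A_1 := \|\hat N_{\hat h} - \hat N_{\hat h, h^*}\|_2 \le A(h^*) + \frac{\chi}{\sqrt{n\hat h}}\|K\|_2,$$

$$A_2 := \|\hat N_{\hat h, h^*} - \hat N_{h^*}\|_2 \le A(\hat h) + \frac{\chi}{\sqrt{nh^*}}\|K\|_2,$$

and 

$$A_3 := \|\hat N_{h^*} - N\|_2.$$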
Figure 4: Reconstruction of $BN$ (left) and of $B$ (right) obtained for $g(x) = x$ and $B(x) = x^2$, 
for various sample sizes. 



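We obtain 

$$\|\hat N - N\|_2 \le A(h^*) + \frac{\chi}{\sqrt{n\hat h}}\|K\|_2 + A(\hat h) + \frac{\chi}{\sqrt{nh^*}}\|K\|_2 + \|\hat N_{h^*} - N\|_2$$

$$\le 2A(h^*) + \frac{2\chi}{\sqrt{nh^*}}\|K\|_2 + \|\hat N_{h^*} - N\|_2. \qquad (4.1)$$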



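Since we have 

$$A(h^*) = \sup_{h' \in \mathcal{H}} \Big\{ \|\hat N_{h^*,h'} - \hat N_{h'}\|_2 - \frac{\chi}{\sqrt{nh'}}\|K\|_2 \Big\}_+$$

$$\le \sup_{h' \in \mathcal{H}} \Big\{ \|(\hat N_{h^*,h'} - E[\hat N_{h^*,h'}]) - (\hat N_{h'} - E[\hat N_{h'}])\|_2 - \frac{\chi}{\sqrt{nh'}}\|K\|_2 \Big\}_+ + \sup_{h' \in \mathcal{H}} \|E[\hat N_{h^*,h'}] - E[\hat N_{h'}]\|_2 \qquad (4.2)$$

and for any $x$ and any $h'$ 

$$E[\hat N_{h^*,h'}(x)] - E[\hat N_{h'}(x)] = \int (K_{h^*} * K_{h'})(x - u) N(u)\,du - \int K_{h'}(x - v) N(v)\,dv$$

$$= \int\!\!\int K_{h^*}(x - u - t) K_{h'}(t) N(u)\,dt\,du - \int K_{h'}(x - v) N(v)\,dv$$

$$= \int\!\!\int K_{h^*}(v - u) K_{h'}(x - v) N(u)\,du\,dv - \int K_{h'}(x - v) N(v)\,dv$$

$$= \int K_{h'}(x - v) \Big( \int K_{h^*}(v - u) N(u)\,du - N(v) \Big) dv,$$

we derive 

$$\|E[\hat N_{h^*,h'}] - E[\hat N_{h'}]\|_2 \le \|K\|_1 \|E_{h^*}\|_2, \qquad (4.3)$$

where 

$$E_{h^*}(x) := (K_{h^*} * N)(x) - N(x), \quad x \in \mathbb{R}_+,$$

represents the approximation term. Combining (4.1), (4.2) and (4.3) entails 

$$\|\hat N - N\|_2 \le \|\hat N_{h^*} - N\|_2 + 2\|K\|_1 \|E_{h^*}\|_2 + \frac{2\chi}{\sqrt{nh^*}}\|K\|_2 + 2\zeta_n$$

with 

$$\zeta_n := \sup_{h' \in \mathcal{H}} \Big\{ \|(\hat N_{h^*,h'} - E[\hat N_{h^*,h'}]) - (\hat N_{h'} - E[\hat N_{h'}])\|_2 - \frac{\chi}{\sqrt{nh'}}\|K\|_2 \Big\}_+$$

$$= \sup_{h' \in \mathcal{H}} \Big\{ \|K_{h^*} * (\hat N_{h'} - E[\hat N_{h'}]) - (\hat N_{h'} - E[\hat N_{h'}])\|_2 - \frac{(1 + \varepsilon)(1 + \|K\|_1)}{\sqrt{nh'}}\|K\|_2 \Big\}_+$$

$$\le (1 + \|K\|_1) \sup_{h' \in \mathcal{H}} \Big\{ \|\hat N_{h'} - E[\hat N_{h'}]\|_2 - \frac{1 + \varepsilon}{\sqrt{nh'}}\|K\|_2 \Big\}_+.$$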

Hence 



IE[||iV-A^||^'^] <n,(E[||iV,.-Ar||^'']+||i^||f||i5;,.||2^ + X^^ 



where 



Now, we have: 



en = sup - E{Ny,)\\2 " ^^^\\Kh] ^. 

h'&n Vnh' 



ElWNh^-NWl"] < 2^'^-'{E[\\Nh*-E[Nh^]\\l'']+\\E[Nh^)-N\\^ 
Then, by setting 



2ql 



< 22^-1 (E [IliV;,* - E[iV;,.]||^''] + Hi?,.. 11^^). 
Kch*iX,,x) ■.= Kh'{x-Xi)- E{Kh*{x - Xi)), 



we obtain 

E[\\Nh*-nNh* 



< 



E 

29-1 
^2q 



n 



y^^Kch*{Xi,x)) dx 



2 N g 



i=l 



n .. 

+ e[| J Kch*iX„x)Kch*{X,,x)dx 
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Since 



J Kcl, {Xi,x)dx = J [Kh* {x - Xi) - E [Kh' [x - Xi)] ) ^ dx 

< 2(y" Kl{x-Xi)dx + j (w.[Kh*{x - Xi)])" dx) 

< 2{\\Kh4l + l^[Kl4^-Xi)]dx) 



< 4||E:,.||i = -||i^||i, 



the first term can be bounded as follows 



E[{[±KcUX.,x)dxy]<{^\\Kgy 

i=l 



For the second term, we apply Theorem 8.1.6 of de la Peña and Giné (1999) (with 2q > 2) 
combined with the Cauchy-Schwarz inequality: 



E 



J Kch*{Xi,x)Kch*{Xj,x)dx 

l<i,jr<n i^j 



[I E 



< ( Ell > I Kch*{Xi,x)Kch*{Xj,x)dx 

l<i,j<n iy^j 



2q 



<ngn'^(E[| / Kch*{Xi,x)Kch*{X2,x)d 



2g-| \ 2 



An, 



2\Q 



<n,n'?(E[| j Kcl,{X^,x)dx\^'']y < □,(^||i^||2) 
It remains to deal with the term E(^^''). By Lemma [1] below, we obtain 

and the conclusion follows. 
Proof of Proposition [3] 

The proof is similar to the previous one and we avoid most of the computations for 
simplicity. For any /lo € 

Wh-Dh < \\Dj,-Dj^^J\2 + \\Dj,^,^-D,j2 + \\Dko-Dh 
< A1 + A2 + A3, 
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with 



x_ 



and 
Then, 



As := \\Dh,-D\\2. 
\\D-, -Dh< 2A{ho) + -^\\g\U\K'\\2 + \\Dh, - D\\, 



To study A(/io), we first evaluate 

nDhiMi^)] - nKix)]- = {Kh,*Kh, * {gNy){x) - {Kh, * {gNYXx) 

= j D{t){Kh,*Kh,){x-t)dt- j D{t)Kh,{x-t)dt 

D{t) i Kh,{x-t-u)Kh2{u)dudt- / D{t)Kh2{x - t)dt 



D{t) J Kh,{v - t)Kh^{x - v)dvdt - J D{v)Kh^{x - v)dv 

Kh, {x-v)(^j D{t)Kh, {v - t)dt - D{v)'^ dv 

= {Kh2-kEh^){x), 
where we set, for any real number x 

Eh, (x) := j D{t)Kh, {x - t)dt - D{x) 

= {Kj,,*D){x)-D{x). (4.4) 

It follows that 

A{ho) = SUp{||Z);,o,;,-^,||2-^||5||oo||i^'||2} + 

< sup {{\\Dh,,h - IE[^ho,h] - {Dh - nDh])h - ^ll5lloo||i^'||2} + 



E[Dh,,h]-nDh]\\2} 



< sup {\\Dh,,h - nDho,h] - {Dh - nmh - ^ibiiooiii^'ib}^ + iii^iiiii^hoib 

hail ynh^ 

< (1 + lli^lli) sup - nDh]h - %^|blloo||i^'||2}+ + ||i^||i||^/.oll2, (4.5) 
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In order to obtain the last line, we use the following chain of arguments: 

. n 

^ i=i •' 

f 1 " 

= / Kh,{t)[-Y,9{X^)K{x-X,-t))dt 

i=l 

and 

E [bh,,h{x)\ = I Khoit)[ 1 9{u)K'^{x -u- t)N{u)duyt, 



therefore 

DhoMx) - E [DhoMx)] = J Kh,{t)G{x - t)dt = Kh,*G{x), 

with 

1 " r 

G{x) = -Y,9{Xi)K{x -Xi)- I g{u)K{x - u)N{u)du 

i=l 

= bh{x)-¥.[bh{x)\. 

Therefore 

< \\KUbh-E[bh]h, 

which justifies (4.5). In the same way as in the proof of Proposition 2, we can establish 
the following: 

E[\\E,,\\l'^] = E [11^,, - DWl''] < nJWE.jl'^ + ( W^Woo^W ' 

Finally, we successively apply (4.4), (4.5) and Lemma 1 in order to conclude the proof. 
Proof of Proposition [4] 

We use the notation and definitions of Section 2.4. We have 

\\C^Hv)-^-\^)\\It = Y1 / {H^,k-C-Hv){x)ydx■.= Y,L^,k■ 
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We prove by induction that for all i, one has Li^k ^ C'^^Hy^H^j . The result follows by 
summation over i. We first prove the two following estimates: 



/ {(Pi,k-^{x)) dx < j^WifW^i, (4.6) 



\fi+i,k - ipiM <k\Mwi- (4.7) 

By definition, $\varphi_{i,k}$ is the average of the function $\varphi$ on the interval $[x_{i,k}, x_{i+1,k}]$ of 
size $\frac{T}{k}$. Thus (4.6) is simply the Wirtinger inequality applied to $\varphi \in \mathcal{W}^1$ on the interval 
$[x_{i,k}, x_{i+1,k}]$. For (4.7), we use the Cauchy-Schwarz inequality: 

^( / {^{x + -) -ip{x))dx] =-2 



- = y {(p{x + -) - ip{x))dxj =— ( / / ip'{z)dzdxf 

^i+i,k — 
k f f T \ T 



We are ready to prove by induction the two following inequalities: 



Li,fc<C72|^||(^||2 , (4.8) 



for two constants Ci and C2 specified later on. First, for i = 0, we have 



— -f^i.fcP < C'lill'/'llwi- (4-9) 



k k 

Lo,k = j \HQ^k{v) - C~^{ip){x)\'^dx = j \^LpQ^k- C^^{'^){x)\^dx. 




We recall (see Proposition A.1. of [1]) that $\mathcal{L}^{-1}(\varphi)(x) = \sum_{n=1}^{\infty} 2^{-2n} \varphi(2^{-n}x)$, and we use the 
fact that $\sum_{n=1}^{\infty} 2^{-2n} = \frac{1}{3}$ and for $a, b > 0$, $ab \le \frac{1}{2}(a^2 + b^2)$ in order to write 



n=l 

k k 



I J]2-2-(v9o,fc-9'(2-"x))|'(ix< 2-"'y |<^o^,-^(2-x)|^dx 

Q n=l n,n'=l q 

n — 7iT 

T 



3 47r2A;2 
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This proves the first induction assumption for i = 0, and 

proves the second one. Let us now suppose that the two induction assumptions are true 
for all j < i — 1, and take i>l. Let us first evaluate 

We distinguish the case when i is even and when i is odd. Let i be even: then, by definition 

•^i-\-l,k •^i-\-l,k 



•^i.k •^i,k 



by the induction assumption and Assertion 14.71 on for j = |. If i is odd, we write by 
definition 



Hence, re-organizing terms, we can write 

IT IT 

Putting together the four inequalities above (the estimates for (p and the induction as- 
sumptions), we obtain 

T\. ,,2 (Cl 1 Cl 1 

and (4.8) is proved. It remains to establish (4.9). Let us write it for i even (the case of 
an odd i is similar): 

2 1 I 1 2 I 1 2 

\Hi+ik — Hik\ = —\Hi+i — Hi_ + Lpt+i — ipi_\ = —\H — H i_ + ip — if i_\ . 

15 2 2 2 2 32 2^ 2 2^ 2 
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Hence, as previously, we obtain 

|i?i+i,fc-i/.,fcP<^^||v^||^i(C| + l). 

To complete the proof, we remark that C2 = and Cf = + 1(1 + ^) < g are suitable. 
It is consequently sufficient to take C = Ci. 
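
The series representation of $\mathcal{L}^{-1}$ recalled in this proof can be checked numerically. The sketch below assumes that the operator has the form $\mathcal{L}(f)(x) = 4f(2x) - f(x)$ (its definition lies outside this section), and the names `L_op` and `L_inv` as well as the test function are hypothetical. Truncating the series after $N$ terms gives $\mathcal{L}(\psi_N)(x) = \varphi(x) - 4^{-N}\varphi(2^{-N}x)$, so the residual decays geometrically for bounded $\varphi$.

```python
import numpy as np

def L_op(f):
    """Assumed form of the operator: L(f)(x) = 4 f(2x) - f(x)."""
    return lambda x: 4.0 * f(2.0 * x) - f(x)

def L_inv(phi, n_terms=40):
    """Truncated series sum_{n=1}^{n_terms} 2^{-2n} phi(2^{-n} x)."""
    def psi(x):
        x = np.asarray(x, dtype=float)
        return sum(4.0 ** -n * phi(x * 2.0 ** -n) for n in range(1, n_terms + 1))
    return psi

# Bounded test function (hypothetical choice) and evaluation points in [0, 4].
phi = lambda x: np.exp(-(np.asarray(x, dtype=float) - 1.0) ** 2)
x = np.linspace(0.0, 4.0, 9)

# Applying L to the truncated inverse should return phi up to rounding errors.
residual = np.max(np.abs(L_op(L_inv(phi))(x) - phi(x)))
```

This only illustrates the exact inverse; the discretized operator $\mathcal{L}_k^{-1}$ studied in Proposition 4 is a separate object.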

4.2 Proof of Theorem [1] and Proposition [1] 
Proof of Theorem [1] 

It is easy to see that 

\\H-Hh,T = \\Cl\kb + \nN)-C-\C{BN))\\2,T 

< \\C~^\kb + KN) - Cl\C{BN))h,T 

+ \\Cl\C{BN)) - C-\C{BN))h,T 

< \\C-^\kb + \nN -{kD + \N))\\2,T 

+ ^l=\\C{BN)\\y,., 

thanks to Proposition 4. Note that $\mathcal{L}(BN) = \kappa (gN)' + \lambda N$, so that we can write 

\\C{BN)\\y,,<C{\\N\\y,, + \\gN\M. 
We obtain, thanks to Lemma 4, which gives the boundedness of the operator $\mathcal{L}_k^{-1}$: 

\\H-Hh,T < □(||/i„^-Ki?||2^^ + ||A„iV-AiV||2^^ + (||iV|Ui + ||5iV|U2)^) 

< n{\kn\\\b - Dh + |A„|||iV - iV||2 + \kn - k\\\D\\2 + \\n " A|||iV||2 

+ (||iV||^, + ||<^iV||^.)^) 

< □(|A„||p„|||^ - Dh + \\n\{\\N - Nh + \pn - Pg{N)\\\Dh) 

+ (IliVlb + |p,(iV)|||I)||2)|A. - A| + mWy,. + \\gN\M^). 

Taking expectation and using the Cauchy-Schwarz inequality, we obtain, for any $q > 1$, 

n\H H\\1.t\ < [{E[Xl'^])'^'{{E[pt'^]f\E[\\b - D\\p]f' + {E[\\N - N\\l''])'^' 
+ \\D\\l{E[\p^-p,{N)\^'^]Y^'} 

+ {\\Nh + p,(iV)||i?||J'E[|A„ - An + ((llA^IIwi + WgNM^Y]. 
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Now, Lemma [3] gives the behaviour of E[|/5„ — pg{N)\'^'^]. In particular, we obtain 

We finally apply successively Propositions 2 and 3 to obtain the proof of Theorem 1. 
Proof of Proposition [1] 

We have already proved (1.9). It remains to prove (1.10). We introduce the event 

$$\Omega_n = \{2\hat N(x) \ge m \text{ for any } x \in [a, b]\}.$$
Then, for n larger that Q^, 



E 

= E 

< E 

< E 

< E 



[B{x) - B{x)fdx 



B{x)-Bix)ydxxlnJ +E[(/ (^(x) - 5(x))'dx x If^c ) 



Bix) - B{x)Ydx X lf^„ r + (2(6 -a){n + Q^)) ^ P(J7^) 



N 



n) 



X In, 



HN -NH\'i 



V 



X Ic 



+ (4n(6-a))5P(J)^) 
+ (4n(6 - a)) 2 P(f)^) 



NN ' 

< □,,„^,M,Q(]E - i^llJ] +E [||iV - A^IIJ]) + (4n(6 - a))^ P(J7^). 

The first term of the right hand side is handled by (11.9P and Proposition [2j The second 
term is handled by Lemma[2]that establishes that P(f^^) = 0(11"''). 

4.3 Technical lemmas 
Concentration inequalities 

We first state the following concentration result. Note that a more general version of this 
result can be found in [13]. We nevertheless give a proof for the sake of completeness. 



Lemma 1. We have the following estimates 



Assume that \\K\\^, \\g\ 
the grid % C {1, 1/2, 1/-Dmax} cif^d D 
77 > 

E[sup{||iV;,-E[iV;,]||2-^i^ 



and \\N\\ are finite. For every q > 0, introduce 



00 

max 



6n for some 6 > 0. Then, for every 
i^lbl^] < □g,^,5||A'||2,||X||i,|lAf|loo"~''- 
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Assume that \\K'\\2,\\K'\\i,\\g\\cx:, and ||A^||oo are finite. For every q > 0, introduce 

the grid % C {1, 1/2, 1/-Dmax} and -Dmax = for some (5 > 0. Then for every 
T] > 

(1 + ^) 

\\\^h - '^iJ^hm - 

hen 



E [sup (III), -E[^,]|b - ^||<7lUI|i^'l|2}^^] < a,,,^,,,..',,.,,,/^,,,,!,.,,,^,!.' 



Proof. Let $X$ be a real random variable and let us consider the random process 

$$\forall t \in \mathbb{R}, \quad w(t, X) = \varphi(X)\Psi(t - X),$$

where $\varphi$ and $\Psi$ are measurable real-valued functions. Let $X_1, \ldots, X_n$ be $n$ independent 
and identically distributed random variables with the same distribution as $X$ and let us 
consider the process 

$$\forall t \in \mathbb{R}, \quad \xi_{\varphi,\Psi}(t) = \sum_{i=1}^{n} \big( w(t, X_i) - E[w(t, X)] \big).$$

First, let us study the behavior of 

1/2 



||?</3,*||2 = ( j ^l,^i't)dt^^ 
If B denotes the unit ball in and ^ is a countable subset of i3, we have 

sup / a{t)£,^^^{t)dt 



= sup / a{t)^^^^{t)dt 

a&A J 

n „ 

= supV I a{t){w{t,Xi) -E[w{t,X)])dt. 

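Hence one can apply Talagrand's inequality (see the version of Bousquet in the independent 
and identically distributed case [16, p. 170]). For all $\varepsilon, x > 0$, one has: 

$$P\big( \|\xi_{\varphi,\Psi}\|_2 \ge (1 + \varepsilon) E[\|\xi_{\varphi,\Psi}\|_2] + \sqrt{2vx} + c(\varepsilon) b x \big) \le e^{-x},$$

where $c(\varepsilon) = 1/3 + \varepsilon^{-1}$, 

$$v = n \sup_{a \in \mathcal{A}} E\Big[ \Big( \int a(t) [w(t, X) - E(w(t, X))]\,dt \Big)^2 \Big],$$

and 

$$b = \sup_{y \in \mathbb{R},\, a \in \mathcal{A}} \int a(t) [w(t, y) - E(w(t, X))]\,dt.$$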
We study each term of the right hand term within the expectation. 
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Obviously, one has: 

2 \l/2 

i=l 



•' i=i 

= ( / j2^[{w{t,Xi) - ^[w{t,X)\f]dt)"'' 

i=l 

<[n [w{t,Xf]dt^^''^ . 
But we easily see that for all y, 

w\t,y)dt<MUnl (4.10) 



hence 

Since f is a supremum of variance terms, 

V < n supE [{ a{t)w{t,X)dt)^] 

< nsupE[ / \w{t,X)\dt / a'^{t)\w{t,X)\dt] 



< nsup / \w{t,y)\dt X supK[\w(t,X)\] 
3/eK J teK 



ooll^^ llooll^lll- 

The Cauchy-Schwarz inequality and (j4.10p give 

b = sup\\w{,,y)-E{w{.,X)]\\2 

,V2 



< sup||u;(.,2/)||2 + (E[ / w{t,Xfdt]y 

yeR J 

< 2||^||oo||*||2. 



The main point here is that ^/v may be much smaller than IE[||^(p,5i ||2]- So, for all e,x > 0, 
P (Ilev.,*ll2 > (1 + e)V^||(/j||oo||*||2 + ||^||oo||iV||^'||^'||iV2^ + 2c(£)||<^||oo 11*1122;) < e'^ 
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Now we consider a family Ai of possible functions if and 4*. Let us introduce some strictly 
positive weights L^^<j, and let us apply the previous inequality to x = L^^if + n for n > 0. 



Hence with probability larger than 1 — ^i)eM ^ ■^'^'*e for all {(p, G ^A, one has 



U^M\2 < (l + e)\/^||</'||oo||^||2 + ||<^||oo||A^||^'||*||iy2;^ + 2c(e)||99|U 

+ ll9'lloo||iV||^/'||^'||l^/2;^ + 2c(e)||(^||oo||*||2U. (4.11) 

Let 

M^,* = (1 + e)V^||(^|U||^||2 + ||^||oo||iV||^'||^||iv/2;^ + 2c(e)||(^||oo||^||2^^,vi>. 
It is also easy to obtain an upper bound of Rq for any q > 1 with 

Rg = E[ sup {U^.^h - M^^^f''] = P( sup {U^M\-2- M^.^f'^ >x)dx. 

{'P,'S')eM Jo (<p,*)eA4 

Indeed 

/ + 00 
r{{\\^^,^\\^-M^,^f_^ >x)dx. 

Then, let us take u such that 

x = /(u)2^ :=(||(^|UI|iV||L/'ll^lli^^+2c(e)||^||oo||^||2n)"', 



so 



dx = 2q{f{u))^' '(x/2;^||^||oo||iV||^2||*||i^ + 2c(e)||v9||oo||^||2)c^u. 



Hence 

n + OO 



E X e-(^-*+")2<7(/(^))''^-^(^/2;^||^|U||^^||J/'||^||l^ + 2c(e)||^|Uy^ 



r+oo 

/ 7rZ-kA -JO Jo 



(l/3,*)G>I 



Finally, we have proved that 



i?,<n,,, Y e-^-*[n^l|A^ll^ll^llL^ll^lli' + ll¥'llL'll^ll2l- (4.12) 
Now let us evaluate what this inequality means for each set-up. 
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First, when 99 = 1/n and ^ = l/hK{»/h), the family M corresponds to the family 
Ti. In that case M^^ij, and L^^^ will respectively be denoted by AI^ and L/^. The 
upper bound given in (|4.12p becomes 

R, < e"^'^ [n-'mi^ + n~'''h~''\\K\\l'^] . (4.13) 

hen 



Now it remains to choose Lh- But 

M, = (1 + e)^\\Kh + ||iV||V2||^|| + 2c(.)^||i^||2L.. 

Vn/i V n nVh 



Let 6* > and let 

Lh 



2\\N\U\K\\lVh' 

Obviously the series in (I4.13P is finite and for any h £ T-L, since h < 1, we have: 



\\N\U\K\\inh 
Since -Dmax = ^'t-, one obtains that 

c{e)9^K\\IVSjKh 



Mh< (1 + 8 + 9 + 



\\N\U\K\\l > ^ 



It remains to choose e = 7?/2 and 6 small enough such that 6 + '^'^ffj,, '^^^}}^u^ = 

II JV ||oO II" II ]^ 

to obtain the desired inequality. 

Secondly, if 99 = g/n and ^ = 1 / K' {• / h)t\\.e family M. corresponds to the family 
"H. So, M(p^<j, and L^p^^^ will be denoted by and L'^ respectively. The upper bound 
given by (j4.12p now becomes 

R,<U,,eY._^-'^''[^^''^^^^N\\U9&^^ (4.14) 
But 

1 I2T' T' 

= (l+e)-^=^|MU||A-'||2 + N|oo||A^||L/'y^ 

Let 6* > and let 

^ 2\\N\U\K'\\lh 
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Obviously the series in ()4.14p is finite and we have: 
But h"^ > {6ny^. Hence 

As previously, it remains to choose e and 9 accordingly to conclude. 

■ 

The second result is based on probabilistic arguments as well. 

Lemma 2. Under Assumptions and notations of PropositionUl if there exists an interval 
[a, b] in (0, T) such that 

[m,M]:=[ inf N{x), sup iV(x)] C (0, oo), Q := sup \H{x)\<oo, 

x6[a,6] x£[a,b] xe[a,b] 

and i/ln(n) < Z?min < ?iV(2mo+i) and n^/^ < D^ax < (n/ ln(7i)) y^r some r? > 

fixed, then there exists Cn, a constant depending on rj, such that for n large enough, 

Proof. Let $x, y$ be two fixed points of $[a, b]$ and, for $h \in \mathcal{H}'$, first let us look at 

Zh{x,y) = Nh{x)-my) - [Kh^N{x)-Kh^N{y)]. 
One can apply Bernstein inequality to Zh{x, y) (see e.g. (2.10) and (2.21) of [16]) to obtain: 

E [e^^^(-)] < exp [-^^^^^^ VAG (0,l/c(x,y)), 



with 



and 



c{x, y) = ^ sup \Kh{x -u) - Kh{y - u) 
'in um 



v\x,y) = - I \Kh{x-u)-Kh{y-u)\^N{u)du. 



But, with /C the Lipschitz constant of K, 
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and 



X;2 

v'^{x,y) < —\x - yp. 



Let $x_0$ be a fixed point of $[a, b]$. One can apply Theorem 2.1 of [2], which gives that, if 



Zh = sup \Zh{x,xo)\ 

x&[a,b] 



then for all positive x, 



¥{Zh > 18{^/v^x + l) + c(x + 1)) < 26"'', 

with 

2 '^1!, |2 

f = ^ a 

and 

^ I/, I 

But one can similarly prove, using Bernstein's inequality, that 



P (|iV,(xo) -Kh* N{xo)\ > Y + Ok^) < 2e- 

Hence 

F(3x^[a,b],\NUx)-K\*N{x)\> \[^^ + £2 ^ 
We apply this inequality with x = q\og{n) + D''\ D = -0^;^, • • • , D'^^^. We obtain 



-Dmax 



D=l 



Consequently, since ^°^i$n — ?• uniformly in h £ Ti' , for n large enough, 

P(3/i G n',3x G [a,5],|iV?,(x) -i^/,*iV(x)| > m/4) < D^n-^. 
But since iV G s > 1, then iV' G so iV' is bounded on M. Then 

E:fc*iV(x)-iV(x) = j Kh{x-y){N{y)-N{x))dy 

K{^){^) t N' {x + u{y - x))dudy 







h h 
h I tK{t) j N'{x -uht)dudt 

35 



So, 

$$|N(x) - K_h * N(x)| \le \square_{N,K}\, h.$$

Because of the definition of $\mathcal{H}'$, this term tends to 0, uniformly in $x$ and $h$. Hence for $n$ 
large enough, for all $x \in [a, b]$ and all $h \in \mathcal{H}'$, 

$$K_h * N(x) \ge N(x) - m/4 \ge 3m/4.$$

Consequently, for n large enough, 

F{3h G n',3x £ [a,b],2Nh{x) < m) < □^n"'?. 



Rosenthal-type inequality 

The following result studies the behavior of the moments of $\hat\rho_n$. 
Lemma 3. For any $p > 2$, 

$$E[|\hat\rho_n - \rho_g(N)|^p] \le \square_{p,g,N,c}\, n^{-p/2}.$$



Proof. We have: 
with 

and 



A„. := E 



B„ := E 



E[\pn- Pg{N)\P] <np(yl„ + i?„,), 



n J g{x)N {x)dx + c g{x)N{x)dx 



Yl7=i9iXi) + c n f g{x)N{x)dx + c 

We use the Rosenthal inequality (see e.g. the textbook [7]): if $R_1, \ldots, R_n$ are independent 
centered variables such that $E[|R_i|^p] < \infty$, with $p > 2$, then 


1 



(4.15) 



1=1 
We recall that Assumption 1 ensures that $E[X_1^p] = \int x^p N(x)\,dx < \infty$, for any $p > 1$. 
Hence, for the first term $A_n$, using (4.15), we have: 



A,. 



1 \r^n Y 
n 1^1=1 



< 



J g{x)N{x)dx + ^ 4^ g{x)N{x)dx 

I 1 " C 
Ug^M,M\-y^Xi / g{x)N{x)dx- 



xN{x)dx{^ I g{x)N{x)dx + 



n 



LI n ^-^ 

i=l 

p/2 



Let us turn to the term 

Er=i X. 



xN{x)dx 



Br, 



ELi aiXi) + c n / g{x)N{x)dx + c ' ^ 

< [\'-tM"']f'' -m n_Jn ^j^'/""!", , .^ r]f" 



1=1 



T.U9{Xd + i){!g{x)N{x)dx + ^) 



< hNV\n]^±X,\'']f'\^[\ ^^^'^^^ 



iT:U9{x^) + ^ 



1 " 



2p\l/2 



j=l 



^E"=i5'(^i) - ! 9{x)N{x)d3^^2p^^i/2 



(E[ 
< □p,-;,iv(]E[| 



IT.U9{X.) + ^ 

\TJl=i9{Xi) - ! g{x)N{x)dx.2p^.i/2 



1)^ 



Now, we set for 7 > 3p 



1 " /■ 
^ i=i •' 



27Var(g(Xi))logn 7||c/||oo log n 



n 



+ 



3n 



(recall that Assumption 1 states that $\|g\|_\infty < \infty$, which also implies $E[g(X_1)^2] < \infty$). 
Since $g$ is positive, the Bernstein inequality (see Section 2.2.3 of [16]) gives: 



> 1 - 2?!^^. 



(4.16) 
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Therefore, we bound from above the term Bn by a constant □p,g,Ar times 



E 



1/2 



where we have used (4.15) for the first term and (4.16) for the second one. This concludes 
the proof of the lemma. ■ 



The boundedness of $\mathcal{L}_k^{-1}$ 

Lemma 4. For any function ip, we have: 



Proof. We have: 



i=o i=o\ 2' 2' y 

\ 3=0 i=l / 

At the second line, we have distinguished the i's that are even and the i's that are odd. 
At the third line, we have used the inequalities $(a + b)^2 \le 2a^2 + 2b^2$ and $(a + b + c + d)^2 \le 
4(a^2 + b^2 + c^2 + d^2)$. By subtraction, we obtain 



k—l 

I \ClH^){x)\''dx<\^Y.^l,. 



The Cauchy-Schwarz inequality gives: 
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so that 



k-i ^ k-l ,^2 



J Yl = jJ2T^i / ^ix)dxf < ip'^ {x)dx 
^=0 xi, 
and finally we obtain the desired result: 



\Cl'^{Lp){x)\^dx < ^ / ip^{x)dx. 
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